ABSTRACT


The Air Force's current operations continue to undergo significant changes compelled
by decreasing fiscal appropriations, aging aircraft, and personnel drawdowns. The Air
Force must improve its current maintenance operations in part to deal with these
challenges. This study explores sensor data from the TF34-100 high-bypass turbofan
engine of the A-10 aircraft fleet to assess its capability for deterioration modelling
and prognostics. In the future, this will allow greater confidence in predicting
compressor stall, which leads to engine performance deterioration and costly repairs.
By using an innovative method to forecast the probability of compressor stall from
individual engine sensor data that has recently become available, it will be possible
to achieve significant benefits in both maintenance planning and mission scheduling,
which will greatly reduce the associated costs of maintenance servicing.
A METHOD TO PREDICT COMPRESSOR STALL IN THE TF34-100 TURBOFAN
ENGINE UTILIZING REAL-TIME PERFORMANCE DATA


I.  INTRODUCTION 


General  Issue 

The aim of preventive maintenance is to reduce the number of unexpected downtimes
and, therefore, the number of unscheduled maintenance actions. Unscheduled maintenance is an
undesirable situation in which a failed component must be repaired or replaced before the system
can return to service. Such unexpected failures can interrupt delivery schedules, cause further
system damage, and result in additional monetary burdens. For many components, maintenance
may best be carried out in a proactive and preventive manner. This research focuses on the
application of preventive maintenance to the TF34-100 jet engine in order to prevent engine
compressor stalls on the A-10 aircraft. Due to their destructive nature, compressor stalls are a
significant concern in axial flow compressor jet engines.

A  compressor  stall  is  caused  by  air  approaching  the  compressor  blades  at  an  angle  greater 
than  their  stalling  angle  resulting  in  a  localized  disruption  of  the  airflow.  The  compressor  blades 
act  like  small,  cambered  wings  with  very  high  aspect  ratios;  like  high  aspect  ratio  wings,  the 
blades  have  relatively  low  stalling  angles.  Below  the  stalling  angle,  an  increase  in  the  angle  of 
attack  produces  a  proportional  increase  in  the  coefficient  of  lift.  But,  at  angles  beyond  the 
stalling  angle,  the  airflow  separates  from  the  upper  surfaces.  This  causes  a  rapid  decrease  in 
coefficient  of  lift,  with  the  departing  air  forming  a  highly  turbulent  wake  downstream  of  the 
blade. This turbulent flow then moves downstream, passing over the stator blades behind and into
the next row of rotors. The process from this point onwards can take a number of forms. In
many cases the turbulence is simply flushed through the engine, such that only one or two blades
stall. In this case there will be few if any outward indications that a stall has occurred. In the
next  level  of  severity,  the  initial  stall  might  cascade  rearwards,  such  that  several  rows  of  blades 
are  affected,  before  the  subsequent  stages  regain  control  of  the  air.  In  this  case,  the  pilot  might 
detect  a  sudden  increase  in  internal  temperature  and  increased  vibration  from  the  engine.  The 
next  level  of  severity,  with  the  stall  cascading  all  the  way  back  through  the  compressor,  will 
obviously increase the severity of the outward symptoms. Some engine types can behave in
quite unusual ways. Some US military turbofans introduced in the 1970s or 1980s, for
example,  exhibited  what  is  termed  a  locked-in  rotating  stall.  In  this  case  the  stalling  of  a  single 
blade  cascaded  not  downstream,  but  onto  the  next  rotor  blade  on  that  disc.  This  caused  a  small 
pocket  of  stall  to  rotate  in  the  opposite  direction  to  the  engine,  but  at  about  half  of  the  engine’s 
revolutions  per  minute  (RPM).  The  engine  would  continue  to  run  and  produce  some  thrust. 

A  blade's  stalling  angle  is  not  entirely  fixed.  Structural  damage  caused  by  the  ingestion 
of  hard  objects,  sand,  and  de-icing  fluid  can  all  reduce  the  stalling  angle.  Engines  tend  to  stall 
more  easily  as  they  age,  or  if  their  compressors  become  iced  or  dirty.  The  degree  to  which  the 
engine  is  able  to  control  its  airflow  increases  from  front  to  rear.  This  is  because  the  first  row  of 
rotor  blades  must  accept  whatever  airflow  they  meet,  whereas  the  subsequent  rows  receive  air 
from  the  blades  ahead  of  them. 

The  most  common  method  of  stall  prevention  is  the  use  of  variable  angle  inlet  guide 
vanes  (VIGV).  These  are  an  additional  row  of  non-rotating  blades,  immediately  ahead  of  the 
first  row  of  rotors.  By  changing  the  pitch  angle  of  these  vanes,  the  angle  of  the  incoming  airflow 
is  varied  to  ensure  that  the  angle  of  attack  of  the  first  row  of  rotors  is  always  less  than  their 
stalling angle. This method is employed in the TF34-100 engines of the A-10.
Problem Statement


Until recently, sufficient data has not been available to support a predictive maintenance
program for the A-10. However, a recent modification to the A-10's data recording capability has
made a broad range of sensor data available. The current A-10 compressor stall maintenance
process is to apply corrective maintenance after a stall has occurred. This reactive process can
produce many issues, not the least of which is the safety of the pilot when a compressor stalls
during flight. Developing a predictive preventative maintenance process for compressor stall,
using the newly available engine sensor data, would greatly benefit the A-10 program.

A-10 engine subject matter experts (SMEs) who analyze compressor stall and other engine
problems are using fault event statistical data in an attempt to derive some form of logistical
planning strategy. Some engine studies examine engine performance deterioration data to develop
a type of preventive maintenance routine. (See articles listed in the bibliography.) There is an
apparent lack of a predictive method for compressor stall (or similar problems) that is specific
to each individual engine and based on that engine's own real-time sensor data.

Research  Flow 

This study explores the relationship between real-time engine sensor data and engine
compressor stalls in order to develop a quantifiable algorithm from this sensor data that can be
applied toward predictive preventative maintenance.

The biggest initial question is whether a connection can be derived between compressor stall
fault events and particular parameters of the engine performance sensor data. Numerous
discussions with engine subject matter experts (SMEs) and engine maintenance SMEs
concluded that compressor stall fault events would most likely be associated with the compressor
discharge pressure (PS3), turbine discharge pressure (PT5), and variable geometry (VG).

Primary investigations and discussions with A-10 engine SMEs suggest that the VG
parameter is the leading indicator of deterioration of the engine control mechanism which, in
turn, is the major factor in the development of compressor stall. The VG measure was created by
the engine manufacturer, General Electric, to assess the discrepancy between the theoretical value
of the guide vane position (the ideal value for an engine operating under pristine working
conditions) and the actual value as measured. The current VG calculation algorithm does not
report the “theoretical value” itself, only the difference between the theoretical value and
the actual value.

When the A-10 engine and engine maintenance SMEs perform a compressor stall case
study, they treat any VG value outside ±1.5° as a reference point for future problems. How, and
to what degree, the VG values are associated with a compressor stall fault event is so far
unknown. It should also be pointed out that General Electric calculates the VG value from three
sensors using an experimentally derived formula. Those sensors are the Compressor
Inlet Temperature (T2C) sensor (also called CIT), the Core Speed sensor (NG), and the Inlet
Guide Vane sensor (IGV).

The premise of this research is that the VG data contains information associated with the
engine stall deterioration mechanism. By modeling the VG data of individual engines, the
resulting model coefficients will be representative of that deterioration mechanism. By
collecting data from multiple flights of multiple engines, the subsequent regression model will
reveal the probability of a compressor stall event that is statistically related to such a
deterioration mechanism.


The locations of certain engine sensors are shown in Figure 1.

Figure 1: Locations of the three sensors of interest (T2C, NG, and IGV).


The recent increase in available engine sensor data from the A-10 aircraft includes many
kinds of sensor data that vary in type and location. This research concentrates on
compressor stall symptoms, which account for the majority of the repair and maintenance cost
for A-10 engines, by identifying a precursor pattern or constant in the recorded engine sensor
data that is now available.

Two  types  of  compressor  stall  are  observed  in  practice  where  each  type  is  characterized 
by  a  distinct  set  of  sensor  conditions. 
Table 1: Types of compressor stalls

Parameter                   WESA                           AOA in Envelope
PLA                         > 12°                          > 12°
PS3 (% drop over 500 ms)    > 25%                          > 50%
NF                          7%
AOA                         in envelope for Mach number    in envelope for Mach number

Photos illustrating the physical locations of several sensors are shown in Figures 2-6
below.

Figure 2: Physical location of the engine sensor Compressor Inlet Temperature (T2C).

Figure 3: Physical location of the engine sensor PLA.

Figure 4: Physical location of the engine sensor IGV.

Figure 5: Physical location of the engine sensors PT5 and PS3.

Figure 6: Physical location of the engine sensor for the Front Frame Accelerometer.
Investigative Questions

How  to  sample  the  flight  sensor  data? 

A  large  A- 10  engine  Real  Time  Engine  Data  (RTED)  database  is  needed,  one  with  consistency 
and  reliability  for  modeling  purposes. 

How to derive a response indicator (dependent variable) in order to establish a
regression model?

There are many fault types within the RTED data; how should the Compressor Stall indicator be
identified?

Is  the  AutoRegressive  Integrated  Moving  Average  (ARIMA)  model  adequate  for  the 
sensor  data? 

What  order  of  the  ARIMA  model  should  be  used?  There  is  a  need  to  understand  which 
type  of  sensor  data  to  use,  one  sensor  or  many  sensors?  In  the  case  of  many  sensors,  what 
transformation  of  the  data  is  required  to  avoid  a  multivariate  scenario? 

How  to  predict  the  fault  event  probability  by  utilizing  an  ARIMA  model? 

Should the forecasting function of the ARIMA model be used to set an Upper Confidence
Limit (UCL)/Lower Confidence Limit (LCL) band to indicate the warning level for the
probability of a future Compressor Stall fault event? Can a probability model be established by
retrieving the Auto Regression (AR) and Moving Average (MA) coefficients from the selected
ARIMA models?

Methodology 

Engine sensor data is time series data: raw data that is collected from the General
Electric TF34-100 engine sensors after each sortie, using a software program entitled “ASIST”
developed by Southwest Research Institute (SwRI). This study fits the raw flight sensor data
with an ARIMA model. Using the outcomes of that modeling, a probability model is then
fitted in order to determine the chance of a compressor stall occurring.

To  ensure  that  a  high  fidelity  state  is  maintained  within  the  study,  the  impact  of  two 
known  variance  issues  will  be  addressed  and  reduced.  First  is  the  sensor  variance.  This  is  an 
inherent  variance  associated  with  the  nature  of  real  world  engineering  data.  Secondly  is  the 
modeling  variance.  This  variance  is  related  to  the  adequacy  of  the  proposed  model. 

The following steps were conducted during this study. Sample data
was collected to conduct a preliminary investigation. An existing software analysis tool,
ASIST, was used to detect the fault events after each flight. Considerable effort went into
filtering out or reducing the effects of sensor failure and of normal and abrupt sensor variance.
Through trial and error it was found that ARIMA modeling is appropriate for the VG data. This
required many attempts to identify an adequate model and the necessary programming for the
ARIMA model. These results yielded the equation:

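$$X_t = \delta + AR_1 X_{t-1} + AR_2 X_{t-2} + \cdots + AR_p X_{t-p} + A_t - MA_1 A_{t-1} - MA_2 A_{t-2} - \cdots - MA_q A_{t-q}$$

where $X_t$ is the VG value at time t, $\delta$ is the constant term, $AR_i$ (i = 1, 2, ..., p) is the
AutoRegression (AR) coefficient, $MA_j$ (j = 1, 2, ..., q) is the Moving Average (MA) coefficient,
and $A_t$ is the random error term at time t. This model is denoted as arima(p,0,q).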
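As an illustration of how the equation above is evaluated, the following minimal Python sketch computes a single value of the series from past values and past error terms. It is not the R code used in this study; the function name arma_next and the numerical coefficients are hypothetical and chosen only for the example.

```python
def arma_next(delta, ar, ma, x_hist, a_hist, shock=0.0):
    """Evaluate X_t = delta + sum(AR_i * X_{t-i}) + A_t - sum(MA_j * A_{t-j}).

    ar, ma  : lists of AR and MA coefficients (lengths p and q)
    x_hist  : past values of the series, most recent last
    a_hist  : past error terms, most recent last
    shock   : the current error term A_t (0.0 gives a point forecast)
    """
    if len(x_hist) < len(ar) or len(a_hist) < len(ma):
        raise ValueError("not enough history for the requested model order")
    value = delta + shock
    for i, ar_i in enumerate(ar, start=1):
        value += ar_i * x_hist[-i]      # AR_i * X_{t-i}
    for j, ma_j in enumerate(ma, start=1):
        value -= ma_j * a_hist[-j]      # minus sign follows the equation
    return value


# Hypothetical arima(2,0,1) coefficients, for illustration only.
forecast = arma_next(delta=0.5, ar=[0.6, -0.2], ma=[0.3],
                     x_hist=[1.0, 2.0], a_hist=[0.1])
print(round(forecast, 4))
```

Setting the current error term to zero, as in the usage line, turns the same expression into a one-step-ahead point forecast.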

Next, a large population of data was collected to develop the associated model. This was followed
by identifying the model variables and selecting an ARIMA model. The model parameters
were estimated for the first order differenced VG data. Then the model was checked for adequacy.
Finally, a probability model was fitted for compressor stall symptoms. The resulting equation
was developed as:

$$p(\text{EngineStall} \mid N_{flights}) = \beta_0 + \sum_{i=1}^{p} \beta_i \cdot AR_i + \sum_{j=1}^{q} \beta_{p+j} \cdot MA_j + \varepsilon$$

where $N_{flights}$ indicates the number of flights of data used (currently $N_{flights}$ = 1, 2, ..., 7),
$AR_i$ are the AR coefficients (i = 1, 2, ..., p), and $MA_j$ are the MA coefficients (j = 1, 2, ..., q).

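This Probability Model of Compressor Stall Fault Event is based on the data as described below:

$$
\begin{pmatrix} y_1 \\ \vdots \\ y_m \\ y_{m+1} \\ \vdots \\ y_n \end{pmatrix}
\qquad
\begin{pmatrix}
AR_{11} & AR_{21} & \cdots & AR_{p1} & MA_{11} & MA_{21} & \cdots & MA_{q1} \\
AR_{12} & AR_{22} & \cdots & AR_{p2} & MA_{12} & MA_{22} & \cdots & MA_{q2} \\
\vdots  & \vdots  &        & \vdots  & \vdots  & \vdots  &        & \vdots  \\
AR_{1n} & AR_{2n} & \cdots & AR_{pn} & MA_{1n} & MA_{2n} & \cdots & MA_{qn}
\end{pmatrix}
$$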

where the response variable (dependent variable) $(y_1, y_2, ..., y_m)$ indicates that a compressor
stall fault event did happen for engine #1 to engine #m, and $(y_{m+1}, ..., y_n)$ indicates that no
stall fault event happened for engine #(m+1) to engine #n; $AR_{ik}$ (i = 1, 2, ..., p; k = 1, 2, ..., n)
and $MA_{jk}$ (j = 1, 2, ..., q; k = 1, 2, ..., n) are the regressor variables (explanatory variables),
which are the AR coefficients and MA coefficients of the fitted arima(p,0,q) model from each set
of VG data. Currently m = 6 and n = 28. This reflects that the data consists of 14 aircraft
(equipped with a total of 28 engines), and that 6 of the engines have compressor stall fault events.
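To make the layout of this data concrete, the following Python sketch assembles a response vector and one regressor row, and evaluates the linear model for a single engine. It is an illustration only and not the R code of this study. The 1/0 coding of the response, the function names, and all numerical values are assumptions introduced for the example.

```python
def response_vector(m, n):
    """Return y with the first m engines marked as stalled.

    Assumption: a stall fault event is coded as 1 and no event as 0.
    """
    return [1] * m + [0] * (n - m)


def design_row(ar_coeffs, ma_coeffs):
    """One row of regressors for one engine: intercept, then AR, then MA."""
    return [1.0] + list(ar_coeffs) + list(ma_coeffs)


def linear_score(betas, ar_coeffs, ma_coeffs):
    """Evaluate beta_0 + sum(beta_i * AR_i) + sum(beta_{p+j} * MA_j)."""
    row = design_row(ar_coeffs, ma_coeffs)
    if len(betas) != len(row):
        raise ValueError("need one beta per regressor plus the intercept")
    return sum(b * x for b, x in zip(betas, row))


y = response_vector(m=6, n=28)          # 6 stalled engines out of 28
# Hypothetical betas and arima(2,0,1) coefficients for one engine.
score = linear_score([0.1, 0.5, -0.2, 0.3], ar_coeffs=[0.4, 0.1], ma_coeffs=[0.2])
print(sum(y), len(y), round(score, 4))
```

Because the model is linear in the coefficients, the fitted value is not constrained to lie between 0 and 1, which is worth keeping in mind when it is read as a probability.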
Assumptions/Limitations

The  first  assumption  is  that  the  flight  sensor  data  file  is  consistent,  without  missing  one 
flight  as  a  whole  or  any  in-flight  data  points. 

The  second  assumption  is  that  the  VG  calculations  are  a  correct  discrepancy  function  of 
the  theoretical  value  compared  to  actual  value  for  the  engine  control  mechanism. 

For  the  third  assumption,  it  is  assumed  that  the  major  contributing  factor  to  a  compressor 
stall  fault  event  is  the  variable  geometry  (VG)  and  that  the  other  factors  are  minor. 

The data for a particular engine has to be tracked for every flight, including any event in
which an engine is changed on the studied aircraft. In other words, the engine number needs to be
consistent throughout any study. The ASIST software is very useful for identifying the particular
engine serial numbers contained in the RTED data files. For the purpose of this study, those
RTED data files have to be converted to csv files and then read into the R programs of
this research.

The adequacy of the developed compressor stall Fault Event Probability Model
can be checked by the theoretical calculations presented in this study, but the model has not
yet been validated against real world data. That is, the forecasts
from the proposed compressor stall Fault Event Probability Model will eventually need to go
through a verification test with real world data over a certain duration, such as 12 to 24
months, to check the accuracy of the fitted model against the A-10 fleet of engines.

Implications 

If a more proactive method to conduct maintenance can be achieved for the A-10 TF34-100
engine, then a great amount of benefit will be realized not only in cost savings for engine
repairs, but also in aircraft/pilot safety and longer engine lifetimes. That is, this method could
save many millions of dollars, increase the engine's reliability, and improve safety for A-10
pilots.

Preview 

The Air Force's current A-10 engine maintenance program for stalls is reactive, responding to
physical manifestations. This paper provides a more proactive model that can lead to a
predictive preventative maintenance program. This involves the examination of a large body of
A-10 engine data maintained by General Electric (GE). The resulting model of the data will
be used to present a compelling option to achieve significant benefits in both maintenance
planning and mission scheduling, which will reduce the associated costs of maintenance
servicing as well as increase the level of safety for the pilots of the A-10.
II. RESEARCH AND DEVELOPMENT PROCESS


Overview 

This  chapter  details  the  benefits  of  utilizing  predictive  preventative  maintenance  to 
enhance the current A-10 engine maintenance program. A brief background on the context areas
is  presented,  followed  by  a  more  detailed  review  of  literature  pertaining  to  the  process  employed 
to  develop  the  analysis.  This  will  be  the  context  used  for  the  methodology  and  data  analysis 
chapters.  The  intent  is  to  present  the  reader  a  clear  understanding  of  the  statistical  processes 
required  to  develop  a  model  of  the  engine  data. 

The “micCID” is a legacy cartridge technology used by the U.S. Air Force to record real-time
engine data (RTED) from various onboard sensors. The data from each engine sensor is
recorded by the micCID at a baseline rate of 1 Hz, but if abnormal behavior is detected this
rate is increased to 100 Hz. The data recorded at 100 Hz are then saved as separate files from the
data recorded at 1 Hz. After each flight, the A-10 ASIST software is used to download the data
from the micCID. Some RTED files are uploaded by technical personnel at bases where A-10
aircraft are deployed to a website called Joint Reliability Availability Management System
(JRAMS). The RTED data used for this research were downloaded from the JRAMS website and also
obtained from a hard disk provided by the Nellis AFB AFETS team.
Processing the RTED Data


Overview 

The  engine  RTED  file  is  downloaded  from  the  aircraft,  and  ASIST  analyzes  the  sensor 
data  to  find  a  fault  event  (engine  stall  and  others).  In  particular,  the  ASIST  tool  is  used  to 
discover  a  fault  event  where  the  parameters  match  the  definition  of  a  compressor  stall  fault 
incident. For example, the screenshot in Figure 7 shows the fault event code “RA75”, indicating
that while the aircraft was airborne, the right engine had an “Engine Stall with AoA in Envelope”
event. Note that this event is initially termed a “Modified TEMS Stall” in ASIST. This event
was then assigned a numbered code of “75” in the engine SME's archive. The ASIST software
also  records  the  engine  serial  numbers  (ESN)  as  shown  in  Figure  7.  Tracking  the  ESN  is  critical 
to  this  research  to  ensure  that  the  same  engine  is  selected  for  the  seven  flights  of  the  analysis. 
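The ESN consistency requirement can be expressed as a simple check over the flight records. The Python sketch below is illustrative only: it assumes each flight has already been reduced to the ESN string reported for one engine position, in chronological order, which is a hypothetical representation and not the ASIST or RTED file format.

```python
def first_consistent_window(esn_per_flight, window=7):
    """Return the start index of the first run of `window` consecutive
    flights that all carry the same engine serial number, or None.

    esn_per_flight : list of ESN strings, one per flight, oldest first
                     (hypothetical representation for this example)
    """
    for start in range(len(esn_per_flight) - window + 1):
        candidates = esn_per_flight[start:start + window]
        if len(set(candidates)) == 1:   # same engine on every flight
            return start
    return None


# Hypothetical history: the engine was changed after the second flight.
history = ["E100", "E100"] + ["E205"] * 7 + ["E100"]
print(first_consistent_window(history))
```

A window that spans an engine change is rejected, so the seven flights used for modeling always describe a single physical engine.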

Figure 7: Screenshot of ASIST opening an RTED data file.
The detailed data file sampling process is illustrated in Figure 8. This figure illustrates
that multiple sets of flight data files could be selected from one particular aircraft,
because some aircraft have more than one recorded compressor stall event. Since it is desired to
obtain as broad a fleet representation as possible, and in order to avoid repeated
errors in the sample data (which would be the case if multiple flight data samples were used from
the same aircraft), one set of sample data has been selected for each aircraft within this study.


Figure 8: Sampling the RTED data files from all available flight data (Nellis hard disk and
JRAMS website). Three situations are illustrated; each aircraft is sampled only once.


Observe and choose all available RTED data for seven consecutive flights, as
illustrated in Figure 8 above.

Use an R program to calculate the VG values and save the results into two separate data
files, one for the left-side engine and one for the right-side engine. Refer to existing GE
documents for a definition of VG and its experimental meaning.

When checking the T2C sensor data, consider any data point with a value less than -15°C
or larger than +95°C as a “bad” data point or “outlier”. There are two reasons for this: the
engine SMEs advise that any data point outside the range of -15°C to +95°C is not real, and
preliminary observations and investigations reveal that removing these particular T2C sensor data
points leads to a better ARIMA model. This rationale is applied to both engine T2C sensors in
order to eliminate the impact of bad T2C sensor data. If the count of “bad” data points from the
right engine is larger than that from the left engine, then the T2C sensor data from the left
engine will be used instead. If the count of “bad” data points from the left engine is larger
than that from the right engine, then the T2C sensor data from the right engine will be used
instead.
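The T2C screening rule described above can be sketched in a few lines of Python. This is an illustration and not the R script of this study; the function names are hypothetical, and the handling of a tie (both engines having the same number of bad points) is an assumption, since the text does not specify it.

```python
T2C_MIN_C = -15.0   # lower validity limit from the text, in degrees C
T2C_MAX_C = 95.0    # upper validity limit from the text, in degrees C


def count_bad_t2c(samples):
    """Count T2C samples outside the -15 C to +95 C range."""
    return sum(1 for v in samples if v < T2C_MIN_C or v > T2C_MAX_C)


def choose_t2c_source(left, right):
    """Decide which engine's T2C record to use for the VG calculation.

    Returns "left" or "right" when one engine has fewer bad points,
    and "own" on a tie (assumption: each engine keeps its own data).
    """
    bad_left = count_bad_t2c(left)
    bad_right = count_bad_t2c(right)
    if bad_right > bad_left:
        return "left"
    if bad_left > bad_right:
        return "right"
    return "own"


# Hypothetical samples: the left sensor reports one implausible value.
left_t2c = [20.0, 21.0, 150.0, 22.0]
right_t2c = [20.0, 21.0, 21.5, 22.0]
print(count_bad_t2c(left_t2c), choose_t2c_source(left_t2c, right_t2c))
```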

Because the physical distance between the left engine air inlet and the right engine is
approximately 10.5 feet, the left engine T2C sensor values will normally be equal or very close
to the right engine T2C sensor values. When “bad” T2C sensor data is detected, that engine's
entire T2C data set is discarded, and a copy of the “good” data points from the other engine is
used for the calculations. (Please see Figure 12.)

When the calculated VG value is larger than 20 degrees while its T2C > 23.8°C, the VG
calculation formula for T2C < 23.8°C is applied, as recommended by the engine SMEs, because
a T2C delay has most probably occurred at that time. The original VG formula from GE
contains four regions where VG is not defined. When the NGC value is larger than 56.1798 for its
full range of T2C values, and when the NGC is larger than 93.3933 while the T2C value is larger
than 37.7°C, the VG value has not been defined; such points are marked with -11, -12, -13, and
-14 in the developed R scripts according to the region into which the data falls. Those marked
data points are eventually treated as “outliers” and eliminated from further calculation in the
modeling process. The impact of eliminating these undefined data points, and the percentage of
data points kept in the calculation process, is shown in Figure 9. In the worst case only 0.663%
of the data points are eliminated, so the impact of eliminating the undefined data points is
minor.
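The elimination of marked points and the resulting percentage can be sketched as follows. This Python fragment is illustrative and is not the R script of this study; the function name and the sample values are hypothetical, and only the four marker codes are taken from the text.

```python
UNDEFINED_VG_CODES = {-11, -12, -13, -14}   # region markers from the text


def drop_undefined_vg(vg_values):
    """Remove marker codes and report the percentage of points removed.

    Note: a genuine VG value exactly equal to one of the codes would also
    be removed; this sketch assumes real VG values never take those values.
    """
    kept = [v for v in vg_values if v not in UNDEFINED_VG_CODES]
    removed = len(vg_values) - len(kept)
    pct_removed = 100.0 * removed / len(vg_values) if vg_values else 0.0
    return kept, pct_removed


# Hypothetical record with two marked points out of eight.
kept, pct = drop_undefined_vg([0.5, -11, 1.2, -14, -0.3, 0.0, 0.1, 0.2])
print(len(kept), pct)
```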

VG  time  series  data  is  generated  by  combining  flights  1  (one)  through  7  (seven).  A  first 
order  differenced  VG  time  series  data  was  then  used  to  develop  an  arima(p,0,q)  model  in  R. 
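The combination and differencing steps can be illustrated with a short Python sketch. It is not the R code of this study (where diff() on a ts object performs the same operation); the function names and the values are hypothetical. Note that differencing a concatenated series also produces one difference across each flight boundary; whether those boundary points were treated specially is not stated in the text, and this sketch leaves them in.

```python
def combine_flights(flights):
    """Concatenate per-flight VG series, oldest flight first."""
    combined = []
    for flight in flights:
        combined.extend(flight)
    return combined


def first_difference(series):
    """Return X_t - X_{t-1} for t = 2..n (one element shorter than input)."""
    return [b - a for a, b in zip(series, series[1:])]


# Two hypothetical short flights.
vg = combine_flights([[1.0, 1.5], [2.0, 1.0]])
print(first_difference(vg))
```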

A compressor stall fault event probability model is created from the associated AR
and MA coefficients, and the adequacy of the linear regression model (LRM) is checked
with well-known classical diagnostic calculations. That is, the R built-in plot() function for
lm objects (stats package) is used, which produces four plots: first, Residuals versus
Fitted Values; second, Normal Probability Plot of Standardized Residuals; third, Square Root of
Standardized Residuals versus Fitted Values; and fourth, Leverage versus Standardized Residuals.

For the ARIMA model, it is necessary to check the data for stationarity and seasonality.
Stationarity means that the statistical properties of the data, such as its mean, do not change
over time; for the differenced data used here this means a mean of zero (or very close to zero).
Seasonality means the data has periodic fluctuations. This research did not observe any
seasonality in the data, so seasonality was not considered in the ARIMA model. Also, the first
order differenced VG data demonstrated good stationarity; therefore, the ARIMA(p,0,q) model was
used without any seasonality feature. To check the stationarity of the first order
differenced VG data numerically, the mean values are calculated as shown in Figure 10. From this
calculation, mean values in the range of 0.00002 to 0.00014 are observed from one-flight data to
seven-combined-flights data. This calculation provides confidence that any impact of the
mean value on the final Linear Regression Model (LRM) will be minor; therefore our
final Probability Model of Compressor Stall Fault Event does not include a mean value as a
regressor.
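One way to read the mean values in Figure 10 is through the following identity, which holds for any series and is offered here as a supplementary note rather than as part of the original analysis. For a VG series $X_1, ..., X_n$, the mean of the first order differences telescopes:

```latex
\frac{1}{n-1}\sum_{t=2}^{n}\left(X_t - X_{t-1}\right) = \frac{X_n - X_1}{n-1}
```

A mean close to zero therefore indicates that the net change in VG over the record is small relative to its length. Whether the variance and autocorrelation are also stable over time is a separate question that this mean value does not address.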


Figure 9: Percentage of “good” data points over number of combined flights. [Plot title:
Mean values of total, bad, and percentage of good vs. number of flights.]
Figure 10: Statistical check for the mean of first order differenced Variable Geometry
(VG) data, computed as mean(diff(ts(VG))). A range of amplitude 0.00002 to 0.00014 is observed
from this plot; thus all first order differenced VG data is considered stationary.
Figure 11: Time-series plots of multiple sensor data captured from the left and right
engines of aircraft 78-0657 during a single flight (top row: T2C, middle row: NG,
bottom row: IGV). All sensor data points appear good.

Figure 12: Time-series plots of multiple sensor data captured from the left and right
engines of aircraft 78-0657 during a single flight (top row: T2C, middle row: NG, bottom
row: IGV). The left engine has “bad” T2C sensor data. In this kind of case, it is necessary to
copy the good (right engine) T2C data and apply it to this engine, since the engines are
physically located 10.5 feet apart on the aircraft.
Check the ACF and PACF of VG Data


The normal practice for ARIMA modeling is, first, to plot the autocorrelation
function (ACF) in order to pre-determine a proper MA order q in arima(p,0,q) and, second, to
plot the partial autocorrelation function (PACF) in order to pre-determine a proper AR order p
in arima(p,0,q).


Figure 13: One flight of Variable Geometry (VG) data. Original sequence plot; first order
differenced time series plot (stationary for the ARIMA model); ACF plot and PACF plot
for first order differenced ts VG data.
Figure 14: Combined four flights of Variable Geometry (VG) data. Original sequence plot;
first order differenced time series plot (stationary for the ARIMA model); ACF plot and
PACF plot for first order differenced ts VG data.
Figure 15: Combined seven flights Variable Geometry (VG) data. Original sequence plot; first order differenced time series plot (stationary for the ARIMA model); ACF plot and PACF plot for first order differenced ts VG data.


The ACF plot in Figure 16 shows a quick decay pattern along the x-axis (number of lags). Beyond lag = 6, all points fall within the 95% confidence interval band, and beyond lag = 2 the amplitude is already very small (< 0.1). This plot suggests that q = 6 would be fully adequate for the model, and that even a lower value of q = 2 may be adequate when using arima(p,0,q). Similar ACF studies were carried out for many other combined flight data sets, and the findings are very similar to those indicated here.
Figure 16: ACF plot for seven-combined-flights-data, which shows a quick decay pattern along the x-axis (number of lags). The blue dashed lines mark the 95% confidence interval.

The PACF plot in Figure 17 indicates a not-so-quick decay pattern along the x-axis (number of lags). Beyond lag = 20, all points fall within the 95% confidence interval band, and beyond lag = 16 the amplitude is already very small (< 0.025). This plot suggests that setting p = 20 would be suitable for the model, and that setting p = 16 may be adequate when using arima(p,0,q). Similar PACF studies were carried out for more combined flight data sets, and the findings are very similar to those indicated here.
Figure 17: PACF plot for seven-combined-flights-data, which indicates a not-so-quick decay pattern along the x-axis (number of lags). The blue dashed lines mark the 95% confidence interval.
Check arima(p,0,q) Performance over Sample Data

A first order difference was formed from the VG data files for the 28 sampled engines. For the 6 engines with compressor stall faults, the flight data immediately prior to the fault flight for that particular aircraft were used; the aircraft serial number (ASN) and engine ID were checked for consistency to ensure that this was the case. Then 1-flight-data to 7-combined-flights-data sets were formed for the specified ASN aircraft. The reason for forming multiple combined-flights data sets is to attempt to achieve an "early alert".

An R script was developed to check a specified arima(p,0,q) model's performance based on four parameters.

First, check the sigma^2 parameter by using the R code:
arima(x, order = c(p, 0, q), optim.method = "Nelder-Mead")$sigma2
where sigma^2 stands for the maximum likelihood estimate (MLE) of the innovations variance. The innovation is the difference between the actual value at time t and the expected mean at time t given the time series prior to t. The variance of the innovations therefore indicates how "noisy" the process is.

Second, check the log-likelihood parameter by using the R code:

arima(x, order = c(p, 0, q), optim.method = "Nelder-Mead")$loglik
where log-likelihood stands for the logarithm of the likelihood function. In statistics, a likelihood function (often simply the likelihood) is a function of the parameters of a statistical model. The likelihood of a set of parameter values, \theta, given outcomes x, is equal to the probability of those observed outcomes given those parameter values, that is:

$$\mathcal{L}(\theta \mid x) = P(x \mid \theta)$$
Third, check the AIC parameter by using the R code:

arima(x, order = c(p, 0, q), optim.method = "Nelder-Mead")$aic

where AIC stands for Akaike Information Criterion, which is a measure of the relative quality of a statistical model for a given set of data.

Fourth, check the Percentage of Significant Coefficients (PSC) parameter.

PSC represents the percentage of significant coefficients among the AR coefficients and MA coefficients of the fitted arima(p,0,q) model. Let x be the first order differenced ts data of VG, and x.fit be the fitted arima(p,0,q) model. That is:

x.fit <- arima(x, order = c(p, 0, q), optim.method = "Nelder-Mead")

A coefficient x.fit$coef[k] (k = 1, 2, ..., (p+q)) of the fitted arima(p,0,q) model is considered significant if it meets the following condition:

$$\left| \frac{\texttt{x.fit\$coef}[k]}{\texttt{x.fit\$var.coef}[k]} \right| > 1.96$$

In the R programming environment, x.fit$coef[k] holds the coefficients of the resulting arima(p,0,q) model, where x.fit$coef[k] are the AR coefficients for k = 1, 2, ..., p, and x.fit$coef[k] are the MA coefficients for k = (p+1), (p+2), ..., (p+q). Correspondingly, x.fit$var.coef[k] is the estimated variance of coefficient x.fit$coef[k].

Therefore, the PSC is calculated as:

$$\mathrm{PSC} = \frac{\text{number of significant } \texttt{x.fit\$coef}[k]}{p + q}$$
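The four performance parameters can be collected from a single fit, as in the sketch below. It makes several assumptions: the series `x` is simulated rather than real VG data, the orders p = 1 and q = 1 are chosen only to keep the example fast, and the significance test uses the conventional ratio of each coefficient to its standard error. In R, `var.coef` is a covariance matrix, so the sketch takes the square root of its diagonal; this is not the same quantity as the coefficient-to-variance ratio described in the text and is shown only for comparison.

```r
# Hypothetical stand-in for first order differenced VG data.
set.seed(2)
x <- arima.sim(model = list(ar = 0.5, ma = 0.3), n = 600)

p <- 1
q <- 1
x.fit <- arima(x, order = c(p, 0, q), optim.method = "Nelder-Mead")

# Keep only the AR and MA coefficients; the intercept is not counted in p + q.
idx <- grep("^(ar|ma)", names(x.fit$coef))

# Standard errors from the diagonal of the coefficient covariance matrix.
se <- sqrt(diag(x.fit$var.coef))[idx]
z  <- x.fit$coef[idx] / se

# The four performance parameters for this model.
perf <- list(
  sigma2 = x.fit$sigma2,
  loglik = x.fit$loglik,
  aic    = x.fit$aic,
  psc    = sum(abs(z) > 1.96, na.rm = TRUE) / (p + q)
)
```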


The  following  is  a  demonstration  of  the  parameter’s  performance  for  the  various  ARIMA  models 
and  the  different  sampling  data: 
For example, the sample data of Figure 18 are fitted to an arima(12,0,4) model for the first order differenced VG data. Using an R script, the coefficients of arima(12,0,4) are calculated as
(-0.1815, 0.0002387, 0.01500, 0.03247, 0.02954, -0.002241, 0.05759, 0.04050, 0.006447, 0.01128, 0.01244, 0.02152, -0.05026, 0.04650, 0.02096, -0.01320)

where the first 12 coefficients are the AR coefficients and the last 4 coefficients are the MA coefficients, because p = 12 and q = 4 in the arima(p,0,q) model here.

The variance of the coefficients of arima(12,0,4) is calculated as
(0.0003720, -0.0002028, -0.00002361, -0.0002780, 0.00001439, 0.00003668, 0.00005502, 0.00005025, 0.00005391, 0.00006573, 0.00003563, 0.00005364, -0.0003380, 0.0003338, -0.00001389, 0.0003176)

Thus, the ratio of coefficient to variance of coefficient, |coefficient / (variance of coefficient)|, is calculated as
(487.84, 1.18, 635.56, 116.78, 2053.07, 61.09, 1046.79, 805.91, 119.60, 171.66, 349.30, 401.21, 148.69, 139.29, 1508.98, 41.57)

The second AR coefficient (AR2) is not significant because its ratio of coefficient to variance of coefficient is 1.18, which is below 1.96. The rest of the coefficients of the arima(12,0,4) model are significant.

Therefore the PSC can be calculated as

$$\mathrm{PSC} = \frac{\text{number of significant } \texttt{x.fit\$coef}[k]}{p + q} = \frac{15}{16} = 0.9375 = 93.75\%$$

A PSC lower than 100% indicates that some coefficients are not significant. It is estimated that compounding factors and non-flight-state-separated data play a role in making some of the arima model coefficients non-significant.
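The arithmetic of this example can be reproduced directly from the two vectors listed above. The sketch below only recomputes the ratio and the PSC from the printed values; the variable names are illustrative.

```r
# Coefficients of the fitted arima(12,0,4) model, as listed in the text.
coefs <- c(-0.1815, 0.0002387, 0.01500, 0.03247, 0.02954, -0.002241,
           0.05759, 0.04050, 0.006447, 0.01128, 0.01244, 0.02152,
           -0.05026, 0.04650, 0.02096, -0.01320)

# Corresponding var.coef[k] values, as listed in the text.
vars <- c(0.0003720, -0.0002028, -0.00002361, -0.0002780, 0.00001439,
          0.00003668, 0.00005502, 0.00005025, 0.00005391, 0.00006573,
          0.00003563, 0.00005364, -0.0003380, 0.0003338, -0.00001389,
          0.0003176)

# Absolute ratio of coefficient to variance of coefficient.
ratio <- abs(coefs / vars)

# A coefficient counts as significant when the ratio exceeds 1.96.
significant <- ratio > 1.96

# Percentage of Significant Coefficients over the p + q = 16 coefficients.
psc <- sum(significant) / length(coefs)
```

Only the second entry falls below the 1.96 threshold, so `psc` evaluates to 15/16.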
[Plot axis: Time Sequence (ASN = 78-0671; Engine = L)]

Figure 18: Example of a sample data set (four-combined-flights) used to explain the concept of Percentage of Significant Coefficients (PSC) of the arima(p,0,q) model.
The following plots demonstrate the scan study results for various arima(p,0,q) models applied to 1-flight-data, 4-combined-flights-data and 7-combined-flights-data.

[Panels: sigma^2, AIC, log-likelihood and % of significant coef[k] versus p in arima(p,0,q); data size 11334]

Figure 19: The four-plot combination for the arima(p,0,q) model where p = 1 to 16 and q = 1 to 6. This is for one flight of Variable Geometry (VG) data. These plots are for sigma^2, log-likelihood, AIC and Percentage of Significant Coefficients (PSC).
Figure 20: The four-plot combination for the arima(p,0,q) model where p = 1 to 16 and q = 1 to 6. This is for four flights of Variable Geometry (VG) data. These plots are for sigma^2, log-likelihood, AIC and Percentage of Significant Coefficients (PSC).
Figure 21: The four-plot combination for the arima(p,0,q) model where p = 1 to 16 and q = 1 to 6. This is for seven flights of Variable Geometry (VG) data. These plots are for sigma^2, log-likelihood, AIC and Percentage of Significant Coefficients (PSC).
Performance Summary for arima(p,0,q) over Various Combined Flights Data

This study uses 28 sample engines, each with 1-flight-data up to 7-combined-flights-data, for a total of 196 (7 x 28) samples. To determine how arima(p,0,q) performed over all the data, 9 arima(p,0,q) models were tested, covering the combinations of p = 12, 14, 16 and q = 2, 4, 6. All csv results and all plots have been retained for future reference and potential revalidation. A total of 22 samples with Negative fault (no compressor stall observed in the next immediate flight) and 6 samples with Positive fault (at least one compressor stall observed in the next immediate flight) were examined separately for their mean values, UCI (Upper Confidence Interval) values, and LCI (Lower Confidence Interval) values for each of the four performance parameters mentioned above (sigma^2, log-likelihood, AIC and PSC), versus 1-flight-data to 7-combined-flights-data.

The  legends  used  are:  Black  ~  Negative,  Red  ~  Positive,  Solid  Lines  ~  Mean,  Dashed  Lines  ~ 
UCI,  and  Dotted  Lines  ~  LCI. 

In general, the confidence interval can be calculated as

$$\text{Confidence Interval} = \bar{x} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$$

where x are the samples, \bar{x} is the mean of the samples, (1 - \alpha) is the confidence level, z_{\alpha/2} is the confidence coefficient, \sigma is the standard deviation of the samples, and n is the sample size. Accordingly, the following R code has been used to calculate the UCI and LCI with a 95% confidence level:

UCI <- mean + qnorm(0.975) * sd / sqrt(n)

LCI <- mean - qnorm(0.975) * sd / sqrt(n)

where n is the sample size of the Negative (N) or Positive (P) group.
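The interval calculation can be sketched as a small helper. The function name `ci95` and the input values are hypothetical; the values are random numbers standing in for one performance parameter over a group of six samples. The sketch uses the upper-tail quantile `qnorm(0.975)` (about 1.96) so that the upper limit lies above the mean.

```r
# 95% confidence limits for the mean of a numeric vector (normal approximation).
ci95 <- function(v) {
  n  <- length(v)
  se <- sd(v) / sqrt(n)   # standard error of the mean
  z  <- qnorm(0.975)      # confidence coefficient for a 95% level
  c(mean = mean(v), lci = mean(v) - z * se, uci = mean(v) + z * se)
}

# Hypothetical parameter values for a group of six samples.
set.seed(5)
vals <- rnorm(6, mean = 10, sd = 2)

ci <- ci95(vals)
```

With only six samples in the Positive group, the normal quantile gives a narrower interval than a t quantile would; the sketch follows the normal form of the formula above.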
Figure 22: Example of the arima(14,0,4) model performance parameter (sigma^2) over various combined flight data.
Figure 23: Example of the arima(14,0,4) model performance parameter (log-likelihood) over various combined flight data.

Figure 24: Example of the arima(14,0,4) model performance parameter (AIC) over various combined flight data.
Figure 25: Example of the arima(14,0,4) model performance parameter Percentage of Significant Coefficients (PSC) over various combined flight data.


The following are the combined charts for the various arima(p,0,q) performance parameter runs.
Figure 26: Example of the arima(12,0,4) performance parameters (sigma^2, log-likelihood, AIC and Percentage of Significant Coefficients (PSC)) over each data.
Figure 27: Example of the arima(14,0,4) performance parameters (sigma^2, log-likelihood, AIC and Percentage of Significant Coefficients (PSC)) over each data.
[Panel title: ARIMA(16,0,6) AIC]

Figure 28: Example of the arima(16,0,4) performance parameters (sigma^2, log-likelihood, AIC and Percentage of Significant Coefficients (PSC)) over each data.
Model Adequacy Check for arima(20,0,6) Model

The adequacy of the arima(20,0,6) model was examined in the following three steps. Similar examinations were carried out for other arima(p,0,q) models, and the results are very consistent.

First, check the arima(20,0,6) residuals with a plot of p-value versus lag, with a maximum lag of 35 as illustrated. For 1-flight-data, all p-values are close to 1 (0.92 to 1); for 4-combined-flights-data, lags 1 to 25 have p-values almost equal to 1; for 7-combined-flights-data, lags 1 to 23 have p-values almost equal to 1. According to the Box.test {stats} help file, the null hypothesis of the test is that the residuals are independent. A p-value near 1.0 therefore gives no evidence against independence, whereas a p-value near 0 would indicate that the residuals are not independent. Note that, unlike the usual reading of coefficient p-values in Linear Regression, a large p-value is the desirable outcome here.

The R code used is:

x.fit <- arima(x, order = c(p, 0, q), optim.method = "Nelder-Mead")
pval[i] <- Box.test(x.fit$residuals, i, type = "Ljung-Box")$p.value
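The p-value versus lag check can be sketched as a loop over lags 1 to 35. The series below is simulated and the model order is kept small so that the example runs quickly; both are assumptions for illustration, not the arima(20,0,6) fit to VG data.

```r
# Hypothetical stand-in for first order differenced VG data.
set.seed(3)
x <- arima.sim(model = list(ar = 0.5, ma = 0.3), n = 500)

x.fit <- arima(x, order = c(1, 0, 1), optim.method = "Nelder-Mead")

# Ljung-Box p-value of the residuals at each lag from 1 to max.lag.
max.lag <- 35
pval <- numeric(max.lag)
for (i in seq_len(max.lag)) {
  pval[i] <- Box.test(x.fit$residuals, lag = i, type = "Ljung-Box")$p.value
}

# Lags (if any) at which independence of the residuals is rejected at the 5% level.
rejected <- which(pval < 0.05)
```

Plotting `pval` against `seq_len(max.lag)` gives the p-value versus lag view described above.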

Second, check the arima(20,0,6) residuals with the Classical 4 Plots: residuals in run order, ACF of residuals, histogram of residuals, and normal probability plot. The run-order plot of the residuals does not show any particular pattern. The ACF of residuals shows that only the first order is significant (this is very desirable). The histogram shows that the residuals are centered around zero and that all are distributed within the (-1, +1) region (this is desirable too). The normal probability plot lies within the (-1.5, +1.5) region, and the residuals are normally distributed; the ideal situation is to be normally distributed within a (-2, +2) region. This check demonstrates the goodness of the residuals.

The R code used is:
plot(x.fit$residuals)
acf(x.fit$residuals)
hist(x.fit$residuals, breaks = 100, prob = TRUE); lines(density(x.fit$residuals))
qqnorm(x.fit$residuals); qqline(x.fit$residuals)

Third, check the arima(20,0,6) residuals with the R built-in function tsdiag {stats}. It produces a combination of standardized residuals, ACF of residuals, and p-values of the Ljung-Box statistic. The run-order plot of standardized residuals does not show any particular pattern, the ACF of residuals shows that only the first order is significant (this is good), and the p-values of the Ljung-Box statistic are near 1.

The R code used is:

tsdiag(x.fit)

The following are some example plots from 1-flight-data, 4-combined-flights-data, and 7-combined-flights-data.
Figure 29: Model adequacy check - one flight. The arima(20,0,6) for first order differenced ts Variable Geometry (VG) data, and p-values of the Ljung-Box test versus lag (max of 35).

Figure 30: Model adequacy check - one flight. The arima(20,0,6) for first order differenced ts Variable Geometry (VG) data, in Classical 4-plots.
Figure 31: Model adequacy check - one flight. The arima(20,0,6) for first order differenced ts Variable Geometry (VG) data, using the R built-in function tsdiag().

Figure 32: Model adequacy check - four flights. The arima(20,0,6) for first order differenced ts Variable Geometry (VG) data, and p-values of the Ljung-Box test versus lag (max of 35).
Figure 33: Model adequacy check - four flights. The arima(20,0,6) for first order differenced ts Variable Geometry (VG) data, in Classical 4-plots.
Figure 34: Model adequacy check - four flights. The arima(20,0,6) for first order differenced ts Variable Geometry (VG) data, using the R built-in function tsdiag().

Figure 35: Model adequacy check - seven flights. The arima(20,0,6) for first order differenced ts Variable Geometry (VG) data, and p-values of the Ljung-Box test versus lag (max of 35).

Figure 36: Model adequacy check - seven flights. The arima(20,0,6) for first order differenced ts Variable Geometry (VG) data, in Classical 4-plots.

Figure 37: Model adequacy check - seven flights. The arima(20,0,6) for first order differenced ts Variable Geometry (VG) data, using the R built-in function tsdiag().
Model Adequacy Check for Probability Model of Compressor Stall Fault Event

The R built-in plotting function plot(lm{stats}) has been used to check the adequacy of the fitted Linear Regression Model. Furthermore, $R^2$ and $R^2_{adj}$ have been retrieved from the results of the R built-in function lm {stats}. Additionally, the model's p-value (not the p-values of the fitted model's coefficients) was calculated by using the following R code:
fit.lm <- lm(Fault ~ ., data = DATA, na.action = NULL)

R2 <- summary(fit.lm)$r.squared

R2.adj <- summary(fit.lm)$adj.r.squared

Standard.Error <- summary(fit.lm)$sigma

Model.p.value <- pf(x[1], x[2], x[3], lower.tail = FALSE)

where x <- summary(fit.lm)$fstatistic, and pf {stats} is the F probability distribution function built into R.

The following plots show the adequacy check for the "best fit" based upon the current selection of sample data. The LRM result would be better if the sample size were increased. According to

$$R^2_{adj} = 1 - (1 - R^2)\,\frac{n - 1}{n - p - 1}$$

where n is the sampling size and p is the number of regressors (equal to p + q in arima(p,0,q)), a larger sample size would be very helpful to obtain a larger $R^2_{adj}$ if the same $R^2$ can be somewhat retained.
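The relation between $R^2$ and $R^2_{adj}$, and the calculation of the model p-value, can be checked with a small sketch. The data frame below is synthetic: the `Fault` column mimics 22 Negative and 6 Positive samples, and the regressors `a1` to `a3` are random numbers standing in for ARIMA coefficients. None of these values come from the study.

```r
# Synthetic data: 28 samples, 3 hypothetical regressors.
set.seed(4)
n <- 28
DATA <- data.frame(
  Fault = c(rep(0, 22), rep(1, 6)),
  a1 = rnorm(n),
  a2 = rnorm(n),
  a3 = rnorm(n)
)

fit.lm <- lm(Fault ~ ., data = DATA)
s <- summary(fit.lm)

# Number of regressors, excluding the intercept.
k <- length(coef(fit.lm)) - 1

# Adjusted R-squared computed by hand from R-squared, n and k.
R2 <- s$r.squared
R2.adj.manual <- 1 - (1 - R2) * (n - 1) / (n - k - 1)

# Overall model p-value from the F statistic and its degrees of freedom.
x <- s$fstatistic
model.p.value <- pf(x[1], x[2], x[3], lower.tail = FALSE)
```

The hand-computed value agrees with `summary(fit.lm)$adj.r.squared`. For a fixed $R^2$ and k, increasing n moves the factor (n - 1)/(n - k - 1) toward 1, which is why a larger sample raises the adjusted value.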
Figure 38: Check Linear Regression Model (LRM) adequacy with R built-in function plot(lm{stats}). This is an example for one-flight-data with arima(12,0,4).

Figure 39: The examination of Linear Regression Model (LRM) adequacy based on various arima(p,0,q) is shown here. For the one-flight-data, the "best fit" is arima(12,0,4).

Figure 40: Check Linear Regression Model (LRM) adequacy with R built-in function plot(lm{stats}). This is an example for two-combined-flight-data with arima(16,0,6).

Figure 41: The examination of Linear Regression Model (LRM) adequacy based on various arima(p,0,q) is shown here. For the two-combined-flight-data, the "best fit" is arima(16,0,6).

Figure 42: Check Linear Regression Model (LRM) adequacy with R built-in function plot(lm{stats}). This is an example for three-combined-flight-data with arima(12,0,4).

Figure 43: The examination of Linear Regression Model (LRM) adequacy based on various arima(p,0,q) is shown here. For the three-combined-flight-data, the "best fit" is arima(12,0,4).

Figure 44: Check Linear Regression Model (LRM) adequacy with R built-in function plot(lm{stats}). This is an example for four-combined-flight-data with arima(14,0,6).

Figure 45: The examination of Linear Regression Model (LRM) adequacy based on various arima(p,0,q) is shown here. For the four-combined-flight-data, the "best fit" is arima(14,0,6).

Figure 46: Check Linear Regression Model (LRM) adequacy with R built-in function plot(lm{stats}). This is an example for five-combined-flight-data with arima(16,0,4).

Figure 47: The examination of Linear Regression Model (LRM) adequacy based on various arima(p,0,q) is shown here. For the five-combined-flight-data, the "best fit" is arima(16,0,4).

Figure 48: Check Linear Regression Model (LRM) adequacy with R built-in function plot(lm{stats}). This is an example for six-combined-flight-data with arima(12,0,2).

Figure 49: The examination of Linear Regression Model (LRM) adequacy based on various arima(p,0,q) is shown here. For the six-combined-flight-data, the "best fit" is arima(12,0,2).

Figure 50: Check Linear Regression Model (LRM) adequacy with R built-in function plot(lm{stats}). This is an example for seven-combined-flight-data with arima(16,0,2).

Figure 51: The examination of Linear Regression Model (LRM) adequacy based on various arima(p,0,q) is shown here. For the seven-combined-flight-data, the "best fit" is arima(16,0,2).
Figure 52: Engine Stall Probability Model Adequacy Check - $R^2$ and $R^2_{adj}$ of "Best Fit" Models versus the number of combined flights.


These are the "best fit" LRMs observed, based on current flight data samples, and utilizing

$$R^2_{adj} = 1 - (1 - R^2)\,\frac{n - 1}{n - p - 1}$$

where n is the sampling size and p is the number of regressors (equal to p + q in arima(p,0,q)). Once again, a larger sample size would be very helpful to obtain a larger $R^2_{adj}$ if the same $R^2$ can be somewhat retained.
Relevant Research - Application of Weight Filter for Time Series Data

It  will  be  beneficial  to  use  a  Weight  Filter  in  order  to  achieve  a  more  concise  PACF 
and/or  ACF,  and  thereby  lower  the  order  of  arima(p,0,q)  to  assist  in  shortening  the  calculation 
run  times. 

For example, on a normal computer (CPU - AMD Athlon64/3400/2.2GHz; Memory - 2.0GB), running the R script "Linear_Regression_Model_FOR_EngineStall_Scan.R" takes 32:58:24 for the 7-combined-flights-data using the 22 fault-Negative samples plus 6 fault-Positive samples. By comparison, processing the same 28 samples takes 28:21:56 for the 6-combined-flights-data, and only 13:56:04 for the 3-combined-flights-data.

Three types of Weight Filter were used in an attempt to sharpen the time series data and so achieve the stated objective. The three filter types were tested in the arima model with 4-combined-flights-data of VG.

Unfortunately, the filter formats used did not yield any sign of reaching the desired objectives. In the Future Research and Discussion section of this study, another filter format, the Kalman filter, is proposed.

For the centered convolution filter which was used, the common mathematical expression is given by:

$$x.filter[i] = f[1]\,x[i+o] + f[2]\,x[i+(o-1)] + \cdots + f[p-1]\,x[i+o-(p-2)] + f[p]\,x[i+o-(p-1)]$$

where the original time series data is x[i], the filtered time series data is x.filter[i], o is the offset, o = (p-1)/2, and p is the number of filter points, which must be an odd number under the convolution filter type.

The filter itself is represented by the weights f[1], f[2], ..., f[p] in this expression.

In R, it is implemented as:

x.filter <- filter(x, c(f[1], f[2], ..., f[p-1], f[p]), sides = 2)

where the filter shape f[i] is normalized so that

$$\sum_{i=1}^{p} f[i] = 1$$

In  this  way,  the  filtered  data  will  meet  two  critical  requirements.  First  it  will  be  a  time 
series,  and  second  it  will  be  a  stationary  set  of  data  which  is  desired  for  an  ARIMA  model. 
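A centered convolution filter of this kind can be sketched as follows. The triangular weights and the simulated series are assumptions made for illustration; the filter shapes actually tested in this study are not reproduced here. The call is written as `stats::filter` so that it is not masked by functions of the same name in other packages.

```r
# Hypothetical 15-point triangular weights, normalized to sum to 1.
p <- 15
o <- (p - 1) / 2
w <- c(1:(o + 1), o:1)
f <- w / sum(w)

# Hypothetical time series standing in for VG data.
set.seed(42)
x <- cumsum(rnorm(200))

# Centered (two-sided) convolution filter.
x.filter <- stats::filter(x, f, method = "convolution", sides = 2)

# Manual evaluation at one interior point, following the expression above:
# f[1] * x[i + o] + f[2] * x[i + o - 1] + ... + f[p] * x[i - o].
i <- 100
manual <- sum(f * x[(i + o):(i - o)])
```

The first and last o values of `x.filter` are NA because the centered window runs past the ends of the series, so a filtered flight is shorter than the original by p - 1 usable points.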

The  results  of  the  applied  filter  exercises  are  provided  below. 

Filter  15 


For a 15-point filter, the following shape was tested:

The mathematical expression of the filtered time series data is:

$$x.filter[i] = f[1]\,x[i+7] + f[2]\,x[i+6] + \cdots + f[8]\,x[i] + \cdots + f[14]\,x[i-6] + f[15]\,x[i-7]$$

The shapes of the three filters are shown in Figure 53.

Figure 53: Shapes of the three filter experiments.
Figure 54: The effect of Filter 15 on the original Variable Geometry (VG) data and the first order differenced VG data, which includes four flights.

Figure 55: The effect of Filter 15 on the Autocorrelation Function and Partial Autocorrelation Function of the first order differenced Variable Geometry (VG) data, which includes four flights.

Figure 56: The effect of Filter 15. ARIMA(p,0,q) model scan of AR 1-16 and MA 1-6 for four flights of first order differenced Variable Geometry (VG) data.
Filter 31


For a 31-point filter, the following shape was tested:

where i = 1, 2, ..., 30, 31.

The mathematical expression of the filtered time series data is:

$$x.filter[i] = f[1]\,x[i+15] + f[2]\,x[i+14] + \cdots + f[16]\,x[i] + \cdots + f[30]\,x[i-14] + f[31]\,x[i-15]$$
Figure 57: The effect of Filter 31 on the original Variable Geometry (VG) data and the first order differenced VG data, which includes four flights.

Figure 58: The effect of Filter 31 on the Autocorrelation Function and Partial Autocorrelation Function of the first order differenced Variable Geometry (VG) data, which includes four flights.

Figure 59: The effect of Filter 31. ARIMA(p,0,q) model scan of AR 1-16 and MA 1-6 for four flights of the first order differenced Variable Geometry (VG) data.
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Filter  61 


For  a  61  point  filter,  the  following  shape  is  experimented: 

11111112222222 

T^'l^’T^’l^’l^’l^’l^’l^’l^'l^’l^’T^’l^’l^’ 

33333334444444 

T^’i^’T^’i^’T^’T^’T^’i^’T^’T^’T^’i^’T^’i^’ 

5  5  5  5  5 

44444443333333 

T^’i^’T^’i^’T^’i^’T^’i^’T^’i^’T^’i^’T^’i^’ 

22222221111111 

165  ’  165  ’  165  ’  165  ’  165  ’  165  ’  T^’ T^’ T^’ T^’ T^’ T^’ T^’ 
Where  i  =  1 ,2, ...  ,60,6 1 . 


The mathematical expression of the filtered time series data is:

x.filter[i] = f[1] * x[i + 30] + f[2] * x[i + 29] + ... + f[31] * x[i] + ... + f[60] * x[i - 29] + f[61] * x[i - 30]
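
The filter expression can be illustrated with a short sketch. The thesis work was done with R scripts; Python is used here only for illustration, and the function names are hypothetical. The sketch assumes the 61 weights rise in steps from 1/165 to 5/165 and back down (seven weights at each level, five at the peak), so that they sum to one, and it leaves the first and last 30 positions undefined because the window would run off the series there.

```python
from fractions import Fraction


def trapezoid_weights_61():
    """Assumed 61-point shape: seven each of 1..4, five 5s, then mirrored, all over 165."""
    steps = [1] * 7 + [2] * 7 + [3] * 7 + [4] * 7
    numerators = steps + [5] * 5 + steps[::-1]
    total = sum(numerators)  # 165, so the weights sum to one
    return [Fraction(k, total) for k in numerators]


def centered_filter(x, f):
    """Centered weighted moving average.

    With 0-based k, out[i] = sum_k f[k] * x[i + half - k], which matches
    x.filter[i] = f[1]*x[i+half] + ... + f[last]*x[i-half] in 1-based notation.
    Positions within `half` samples of either end are returned as None.
    """
    half = len(f) // 2
    out = [None] * len(x)
    for i in range(half, len(x) - half):
        out[i] = float(sum(f[k] * x[i + half - k] for k in range(len(f))))
    return out
```

Because the weights sum to one, a constant series passes through unchanged, which is a quick sanity check on any candidate filter shape.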

Figure 60: The effect of Filter 61 on the original Variable Geometry (VG) data and the first order differenced VG data, which include four flights.

Figure 61: The effect of Filter 61 on the Autocorrelation Function and Partial Autocorrelation Function of the first order differenced Variable Geometry (VG) data, which includes four flights.

Figure 62: The effect of Filter 61. ARIMA(p,0,q) model scan of AR1-16 and MA1-6 for four flights of the first order differenced Variable Geometry (VG) data.


III. METHODOLOGY


Overview 

The  purpose  of  this  chapter  is  to  demonstrate  that  the  calculated  engine  compressor  stall 
model  can  be  an  effective  predictive  maintenance  indicator  for  the  data  used  from  the  historical 
General  Electric  TF34-100  engine  data  repository.  This  study  demonstrates  that  the  probability 
model  developed  is  adequate  according  to  the  theoretical  model  adequacy  checks. 

Test  Subjects 

Once again, the engine sensor data is time series data. The raw data is collected from the RTED data recorders of the General Electric TF34-100 engines, which are installed throughout the A-10 fleet. The raw flight sensor data is used to calculate the VG data and the first order differenced VG data, which is then fitted with an arima(p,0,q) model. From the results of the ARIMA modeling, a probability model is then fitted to determine the likelihood of a compressor stall occurring.

To maintain high fidelity in the study, much effort went into eliminating or reducing the impact of two known sources of variance. The first is sensor variance, an inherent variance associated with the nature of real-world engineering data. The second is modeling variance, which is related to the adequacy of the proposed model.

The  following  strategic  steps  were  designed  within  the  research: 

The first step is to collect sample data and conduct a preliminary investigation. This includes using existing tools to observe which events are detected after flight. The data then needs to be filtered and manipulated to remove or reduce the effects of sensor failure, normal sensor noise, and abrupt sensor noise. The next step is to develop a model by trial and error. Here an adequate ARIMA model is identified, with its necessary R scripts, of the form:

$$X_t = \delta + AR_1 X_{t-1} + AR_2 X_{t-2} + \cdots + AR_p X_{t-p} + A_t - MA_1 A_{t-1} - MA_2 A_{t-2} - \cdots - MA_q A_{t-q}$$

where X_t is the VG value at time t, \delta is a constant, A_t is the random shock at time t, AR_p is the AutoRegression (AR) coefficient, and MA_q is the Moving Average (MA) coefficient. This model is denoted as arima(p,0,q).
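
To make the model form concrete, the following is a minimal sketch of first order differencing and of the one-step-ahead recursion implied by an arima(p,0,q) model with known coefficients. It is written in Python for illustration only (the thesis used R scripts), the function names are hypothetical, and values before the start of the series are assumed to be zero, which is a simplification rather than what a fitting routine would do.

```python
def first_difference(x):
    """First order differencing: d[t] = x[t+1] - x[t]."""
    return [b - a for a, b in zip(x, x[1:])]


def arma_one_step(x, ar, ma, delta=0.0):
    """One-step-ahead predictions and shocks for an ARMA(p, q) model.

    Model: X_t = delta + sum_i ar[i-1] * X_{t-i} + A_t - sum_j ma[j-1] * A_{t-j}
    The prediction drops the unknown A_t; the shock is the prediction error.
    Terms that would reach before the start of the series are skipped,
    which is equivalent to assuming zero pre-sample values.
    """
    preds, shocks = [], []
    for t in range(len(x)):
        pred = delta
        for i, phi in enumerate(ar, start=1):
            if t - i >= 0:
                pred += phi * x[t - i]
        for j, theta in enumerate(ma, start=1):
            if t - j >= 0:
                pred -= theta * shocks[t - j]
        preds.append(pred)
        shocks.append(x[t] - pred)
    return preds, shocks
```

In practice the coefficients are estimated from the data (for example by maximum likelihood) rather than supplied by hand; the recursion only shows how the AR and MA terms enter the model.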

The next step is to collect data from a large population and develop the associated model. This includes identifying the model variables and selecting a proper ARIMA model, estimating the ARIMA parameters, checking for model adequacy, and fitting a linear regression probability model for the compressor stall symptom. This model is given by:

$$P(\mathrm{EngineStall} \mid N_{flights}) = \beta_0 + \sum_{i=1}^{p} \beta_i \, AR_i + \sum_{j=1}^{q} \beta_{p+j} \, MA_j + \varepsilon$$

where N_flights indicates the number of flights of data used (currently N = 1, 2, ..., 7), AR_i are the AR coefficients (i = 1, 2, ..., p), and MA_j are the MA coefficients (j = 1, 2, ..., q).

This model of the compressor stall fault event is based on data in which the response variable (dependent variable) (y_1, y_2, ..., y_m) indicates that a compressor stall fault event did occur for engine #1 to engine #m, and (y_{m+1}, ..., y_n) indicates that no stall fault event occurred for engine #(m+1) to engine #n. AR_{ik} (i = 1, 2, ..., p; k = 1, 2, ..., n) and MA_{jk} (j = 1, 2, ..., q; k = 1, 2, ..., n) are the regressor variables (explanatory variables), which are the AR coefficients and MA coefficients of the arima(p,0,q) model fitted to each individual set of first order differenced VG data.

Currently, m = 6 and n = 28. That is, among the 14 aircraft studied (equipped with 28 engines), 6 of the 28 engines have compressor stall fault events.
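
The regression step can be sketched as an ordinary least-squares fit of a 0/1 stall indicator on the fitted ARIMA coefficients. The Python below is an illustration only: the function names are hypothetical, no engine data is reproduced, and the solver is a plain normal-equations implementation chosen to keep the sketch self-contained, not the routine used in the study (which relied on R).

```python
# Response layout described in the text: 6 engines with a stall event, 22 without.
y_stall = [1] * 6 + [0] * 22


def ols_fit(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y.

    X is a list of rows, one per engine, holding that engine's AR and MA
    coefficients. A leading 1.0 is added to each row for the intercept.
    Solved by Gauss-Jordan elimination with partial pivoting.
    """
    rows = [[1.0] + list(r) for r in X]
    k = len(rows[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(r[a] * r[b] for r in rows) for b in range(k)]
         + [sum(r[a] * yi for r, yi in zip(rows, y))]
         for a in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [v - factor * p for v, p in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


def predict(beta, row):
    """Estimated response: beta_0 + sum of beta_i times the regressor values."""
    return beta[0] + sum(b * v for b, v in zip(beta[1:], row))
```

One property of a linear probability model worth keeping in mind is that its fitted values are not constrained to lie between 0 and 1, so an estimated response should be read as a score for ranking engines as much as a literal probability.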

Summary 

By fitting an arima(p,0,q) model to the RTED data for each individual engine of the A-10 fleet, a Probability Model for Compressor Stall Fault Event has been established, which can predict the probability of a compressor stall occurring in that particular engine during the next flight. If the probability is high, preventive engine maintenance will be recommended, thus avoiding the potential costs resulting from compressor stall damage and the associated pilot safety issues.

IV. ANALYSIS AND RESULTS


Results  of  arima(p,0,q)  Model 

The Model Adequacy Check for the arima(p,0,q) model of first order differenced VG data shows very good results (refer to Figure 29 through Figure 37). The Normal Probability Plot also illustrates a very good fit over the (-2, 2) Normal Score range of the data. (The observed tails may be related to data from the take-off and landing periods.) Two major steps have been taken to ensure the adequacy of the arima(p,0,q) model.

First, an R script is developed to check the 4 performance parameters of a to-be-specified arima(p,0,q) model over various combined-flight data. The performance parameters are: Sigma^2, Log-Likelihood, AIC, and Percentage of Significant Coefficients (PSC). (See Figure 22, Figure 23, Figure 24, Figure 25, Figure 26, Figure 27, and Figure 28.)

Second, another R script is developed to check the residuals of a specified arima(p,0,q) model. The following plots are checked: Ljung-Box test p-value vs Lag, Residuals Sequence, ACF of Residuals, Histogram of Residuals, Normal Probability Plot of Residuals, and Standardized Residuals.
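
Two of the quantities named above have simple closed forms, sketched here in Python for illustration (the study used R scripts; the function names are hypothetical). The Ljung-Box statistic is Q = n(n+2) * sum over k = 1..h of r_k^2 / (n - k), where r_k is the lag-k sample autocorrelation of the residuals, and AIC is 2 * (number of estimated parameters) - 2 * (log-likelihood). Converting Q to a p-value requires a chi-square distribution and is left out of this sketch.

```python
def autocorr(res, k):
    """Lag-k sample autocorrelation of a residual series."""
    n = len(res)
    mean = sum(res) / n
    denom = sum((r - mean) ** 2 for r in res)
    num = sum((res[t] - mean) * (res[t - k] - mean) for t in range(k, n))
    return num / denom


def ljung_box_q(res, max_lag):
    """Ljung-Box Q statistic over lags 1..max_lag."""
    n = len(res)
    return n * (n + 2) * sum(
        autocorr(res, k) ** 2 / (n - k) for k in range(1, max_lag + 1)
    )


def aic(log_likelihood, n_params):
    """Akaike Information Criterion; lower values indicate a preferred model."""
    return 2 * n_params - 2 * log_likelihood
```

A small Q (large p-value) at each lag is what one hopes to see, since it indicates no remaining autocorrelation in the residuals of the fitted model.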

The results of the examinations mentioned above are very promising. No major obstacle was observed. During the process, arima(20,0,6) was selected as the desired ARIMA model to be applied. The imposed time constraints led to the selection of a total of 28 sample engines, which turned out to be smaller than the desired sample size.

Results of Linear Regression Model (LRM)

The  Model  Adequacy  Check  for  the  Probability  Model  of  the  Compressor  Stall  Fault 
Event  is  accomplished  in  two  parts. 

First, the built-in R diagnostic plots for linear regression (plot() applied to an lm {stats} fit) are used, which provide 4 plots: Residuals vs Fitted Values, Scale-Location, Normal Probability, and Residuals vs Leverage (Cook's Distance). For examples of these refer to Figure 38, Figure 40, Figure 42, Figure 44, Figure 46, Figure 48, Figure 50, and Figure 52.

Second, the Response vs Estimated Response plots (y vs y-hat) are checked as an intuitive visual presentation. Blue dots with dashed lines illustrate the original responses, which are detected by ASIST. Black points with solid lines illustrate the response estimated by the fitted LRM. The order of the arima(p,0,q) model used and the resulting modeling parameters are also shown on each of the example plots. Refer to Figure 63, Figure 64, and Figure 65.

[Plot: Linear Regression Model Based On arima(12,0,4) For 1 Combined Flight(s) Data; y (blue) and y_hat (black) vs Data_Points]

Figure 63: Check of the goodness of fit of the proposed Probability Model for Compressor Stall Fault Event. This plot is a Linear Regression Model (LRM) based on arima(12,0,4) for one-flight data.

[Plot: Linear Regression Model Based On arima(14,0,6) For 4 Combined Flight(s) Data; y (blue) and y_hat (black) vs Data_Points]

Figure 64: Check of the goodness of fit of the proposed Probability Model for Compressor Stall Fault Event. This plot is a Linear Regression Model (LRM) based on arima(14,0,6) for four-combined-flight data.

[Plot: Linear Regression Model Based On arima(16,0,2) For 7 Combined Flight(s) Data; x-axis Data_Points]

Figure 65: Check of the goodness of fit of the proposed Probability Model for Compressor Stall Fault Event. This plot is a Linear Regression Model (LRM) based on arima(16,0,2) for seven-combined-flight data.


These plots provide strong evidence that when more flight data is used to fit an arima(p,0,q) model, a higher AR order leads to a better LRM. This agrees with the findings of the PACF analysis in the previous sections. (See Figure 13, Figure 14, and Figure 15.) An arima(20,0,6) model indicates a better LRM, but the current sample size rules out that option. Mathematically, consider the equation

$$R^2_{adj} = 1 - (1 - R^2)\,\frac{n-1}{n-k-1}$$

where n is the sample size and k, the number of regressors, is the number of terms (p+q) in arima(p,0,q). It becomes clear that a larger sample size leads to a better R^2_adj in the LRM.

For example, if R^2 = 0.80 and R^2_adj = 0.75 are the desired objectives and the intent is to use the arima(20,0,6) model, then a sample size of n = 131 would be required.

As another example, if R^2 = 0.85 and R^2_adj = 0.70 are the desired objectives and the intent is to use the arima(12,0,4) model, then a sample size of n = 49 would be required.
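
The sample-size figures follow from solving the adjusted R-squared relation for n. The sketch below, in Python for illustration with hypothetical function names, assumes the standard form R^2_adj = 1 - (1 - R^2)(n - 1)/(n - k - 1) with k = p + q regressors.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R-squared for n observations and k regressors."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


def required_sample_size(r2, r2_adj, k):
    """Sample size n at which a given R-squared yields the target adjusted value.

    Let ratio = (1 - r2_adj) / (1 - r2) = (n - 1) / (n - k - 1).
    Solving for n gives n = (ratio * (k + 1) - 1) / (ratio - 1).
    """
    ratio = (1 - r2_adj) / (1 - r2)
    return (ratio * (k + 1) - 1) / (ratio - 1)
```

For arima(20,0,6), k = 26, and R^2 = 0.80 with R^2_adj = 0.75 gives n = 131, matching the first example. For arima(12,0,4), k = 16, this expression gives n = 49 when R^2 = 0.80 and R^2_adj = 0.70, but n = 33 when R^2 = 0.85 and R^2_adj = 0.70, so the values printed for the second example may not all correspond to the same calculation.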




Investigative  Questions  Answered 

How  to  sample  the  flight  sensor  data? 

The Nellis downloaded data and additional flight data from the JRAMS website were obtained for review. A systematic review of the flight data was implemented following execution of the fleet-wide A-10 Engine Data Record Program.

How to derive a response indicator (dependent variable) in order to establish a regression model?

The existing ASIST software is used to identify Compressor Stall fault events for all the flight data files from the 28 engines of 14 aircraft.

Is  the  ARIMA  model  adequate  for  the  sensor  data? 

The research shows that the PSC (percentage of significant coefficients) of the arima(20,0,6) model is very high, and the Ljung-Box test, the normal probability plot of residuals, and the ACF of residuals have also shown very good results. Thus it is believed that a good ARIMA model was developed from the first order differenced VG data.

How  do  we  predict  the  fault  event  probability  by  utilizing  the  ARIMA  model? 

The coefficients of the arima(p,0,q) model fitted to the first order differenced ts(VG) data were chosen as the inputs for the linear regression probability model. This model demonstrated a very proficient predictive capability.

V. CONCLUSIONS AND RECOMMENDATIONS


Conclusions  of  Research 

First, this study has successfully developed the proposed ARIMA-LRM method to predict engine compressor stall faults based on existing real-time performance data (RTED) for the A-10 TF34-100 engines.

Second, this modeling can now lead to significant maintenance cost savings for the USAF and increase the A-10's reliability and, subsequently, the pilot's safety.

Significance  of  Research 

This is the first time a method has been proposed to predict the compressor stall fault event probability of an individual A-10 TF34-100 turbofan engine based on RTED data.

Recommendations  for  Future  Actions 

Establish an Engine Data Analysis team to monitor, track, and analyze A-10 fleet-wide engine data.

Improve the proposed Probability Model for Compressor Stall Fault Event by utilizing all the available data for a larger sample size and a longer time period.

Conduct a comparison of "Cost of Maintenance with Prediction Implemented" to "Cost of Maintenance without Prediction Implemented".


Recommendations  for  Future  Research 

Investigate the use of a Kalman Filter to suppress variance and obtain a better model.

Use a Fast Fourier Transform to achieve higher computational efficiency. Investigate other perspectives to achieve better models.

Investigate whether a multivariate approach would yield a better model by adding more sensors into the explanatory variables.

Use Bayesian Statistics to test the trustworthiness of the model.

Study the effects of adding more engine faults into the response variables in an effort to gain a better understanding of the engine system. This would be a more comprehensive undertaking, and the complexity of the work would increase substantially.


Summary 

An ARIMA-LRM method has now been developed to predict engine compressor stall faults based on real-time performance data (RTED). The ARIMA-LRM results for 1-flight data and for 3- and 4-combined-flights data are very good. The ARIMA-LRM results for 2-, 5-, 6-, and 7-combined-flights data are not as desirable. This model can be exploited to achieve cost savings in engine maintenance repairs, longer engine lifetimes, and increased aircraft and pilot safety. That is, this method has the potential to save many millions of maintenance dollars, increase engine/mission reliability, and ensure greater safety for A-10 pilots.
APPENDIX A. THE FORMULA TO CALCULATE VARIABLE GEOMETRY (VG)


Variable Geometry (VG) Schedule Calculation Formula

$$NGC = \frac{NG}{\sqrt{(T2C + 273.15)/288.15}}$$

Each entry below lists the NGC Range (%), the expression for VG (degree), and the formula number (Note), grouped by T2C condition.

37.7 °C ≥ T2C ≥ 23.8 °C

(56.1798, 69.0730]   VG = IGV - 0.556883 * NGC + 93.385559   Formula #1
(69.0730, 73.0337)   VG = IGV + 1.355821 * NGC - 148.570620   Formula #2
(73.0337, 77.0056)   VG = IGV + (0.861517 * NGC - 112.469742) + {(1.005462) * NGC + (-71.408119)} * (T2C - 23.888889)/13.888891   Formula #3
(77.0056, 78.6517)   VG = IGV + (0.861517 * NGC - 112.469742) + {(0.577181) * NGC + (-38.428047)} * (T2C - 23.888889)/13.888891   Formula #4
(78.6517, 82.7191)   VG = IGV + (2.276908 * NGC - 223.792650) + {(-0.838210) * NGC + (72.894861)} * (T2C - 23.888889)/13.888891   Formula #5
(82.7191, 85.3933)   VG = IGV + (2.276908 * NGC - 223.792650) + {(-1.303533) * NGC + (111.385942)} * (T2C - 23.888889)/13.888891   Formula #6
(85.3933, 93.3933)   VG = IGV + (0.973375 * NGC - 113.258917) + {(0.000000) * NGC + (0.852209)} * (T2C - 23.888889)/13.888891   Formula #7

T2C < 23.8 °C

(56.1798, 69.0730)   VG = IGV + 0.556883 * NGC - 93.385559   Formula #8
(69.0730, 73.0337)   VG = IGV + 1.355821 * NGC - 148.570620   Formula #9
(73.0337, 78.6517)   VG = IGV + 0.861517 * NGC - 112.469742   Formula #10
(78.6517, 85.3933)   VG = IGV + 2.276908 * NGC - 223.792650   Formula #11
(85.3933, 93.3933)   VG = IGV + 0.982 * NGC - 113.258917   Formula #12
(93.3933, 100%)      VG = IGV - 21.5   Formula #13

T2C > 37.7 °C

(56.1798, 69.0730)   VG = IGV + 0.556883 * NGC - 93.385559   Formula #14
(69.0730, 77.0056)   VG = IGV + 1.866979 * NGC - 183.977861   Formula #15
(77.0056, 82.7191)   VG = IGV + 1.438698 * NGC - 150.897789   Formula #16
(82.7191, 93.3933)   VG = IGV + 0.973375 * NGC - 112.406708   Formula #17
(93.3933, 100%)      VG = IGV - 21.5   Formula #18

What to do if NGC < 56.1798

What to do if NGC > 93.3933
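
As an illustration of how the schedule is evaluated, the sketch below computes the corrected speed NGC and then applies the T2C < 23.8 °C formulas (#8 to #13). It is written in Python with hypothetical function names, uses the coefficients as transcribed in this appendix, and assumes each NGC range is closed at its upper bound; the handling of NGC at or below 56.1798 is not recoverable from the table, so the sketch returns None there.

```python
import math


def corrected_speed(ng, t2c):
    """NGC = NG / sqrt((T2C + 273.15) / 288.15), with T2C in degrees Celsius."""
    return ng / math.sqrt((t2c + 273.15) / 288.15)


# (upper NGC bound, slope, intercept) for T2C < 23.8 °C, Formulas #8 to #12
COLD_BAND = [
    (69.0730, 0.556883, -93.385559),
    (73.0337, 1.355821, -148.570620),
    (78.6517, 0.861517, -112.469742),
    (85.3933, 2.276908, -223.792650),
    (93.3933, 0.982, -113.258917),
]


def vg_cold(igv, ngc):
    """VG in degrees for T2C < 23.8 °C, given IGV and corrected speed NGC (%)."""
    if ngc <= 56.1798:
        return None  # no formula is given for this range
    for upper, slope, intercept in COLD_BAND:
        if ngc <= upper:
            return igv + slope * ngc + intercept
    return igv - 21.5  # Formula #13, NGC above 93.3933
```

At the standard-day temperature of 15 °C the correction factor is one, so NGC equals NG; the other two temperature bands could be handled the same way with their own coefficient lists.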